HDV nucleic acid molecules and applications thereof

ABSTRACT

The invention concerns nucleic acid molecules derived from novel hepatitis D virus strains or isolates constituting genotypes different from known I, II and III genotypes, their fragments, corresponding proteins and their uses as diagnostic reagents. The invention also concerns a method for sensitive diagnosis of the hepatitis D virus (or delta hepatitis virus) and a method for epidemiologic monitoring of HDV-related infections.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a 371 application of PCT/FR02/03239 filed Sep. 23, 2002 and which also claims the benefit of FR 01/12285 filed on Sep. 24, 2001.

The present invention relates to nucleic acid molecules derived from novel hepatitis D virus strains or isolates constituting genotypes different from the known genotypes I, II and III, and to their fragments, to the corresponding proteins, and also to their uses as diagnostic reagents.

The present invention also relates to a method for sensitive diagnosis of the hepatitis D virus (or delta hepatitis virus) and to a method for epidemiological monitoring of HDV infections.

The hepatitis D virus (HDV) or delta hepatitis virus is a hepatitis B satellite virus. This virus has a specific chimeric structure, associating the specific HDV components (viral RNA and HD proteins) with an envelope comprising the three HBV glycoproteins: large (preS1-preS2-S), medium (preS2-S) and small (S). The average diameter of HDV particles is between that of mature HBV particles (Dane particles: 42 nm) and that of HBV empty envelopes (spherical or filamentous form: 22 nm) and the flotation density is 1.24-1.25 g/cm³.

In the virions, the HDV RNA is circular and of negative polarity. This closed circular single strand, the smallest known genome of viruses which infect mammals, has a high GC percentage (60%).

The HDV RNA replicates independently of HBV, the role of which is limited to providing the envelope of HDV. The only proteins found (sHD and LHD) are encoded by the antigenomic RNA which, in the infected cell, is complete, circular and pseudo-double-stranded, serves as a replication intermediate and is the target for editing.

The HDV RNA belongs to a specific type of ribozyme. The self-cleavage reaction requires the RNA and a divalent cation (Mg⁺⁺). The cleavage creates a 2′,3′-cyclic phosphate end and a hydroxyl 5′ end. Delta ribozymes (genomic and antigenomic) have a similar pseudoknot secondary structure. The sequences involved include mainly or exclusively sequences located 3′ of the self-cleavage site (approximately 84 nucleotides).

During the viral cycle the HDV mRNA encodes a protein, two forms of which exist: a 194-195 amino acid protein (‘s’ form for small) of 24 kilodaltons (kDa) and a 214 amino acid protein (‘L’ form for large) of 27 kDa, which exist in varying proportions. These proteins carry the ‘delta’ antigenicity and are detected in the liver or the serum of infected patients or animals (chimpanzee, marmot). These two viral proteins sHD and LHD are initiated at the first ATG of the open reading frame located at position 1598 (according to the numbering of Wang et al., 1986 or 1987) of the antigenomic RNA. During replication, a mutation, dependent on a cellular enzyme, ‘double-stranded RNA-dependent adenosine deaminase’, appears at position 1012, converting the amber stop codon (UAG) into a tryptophan codon (UGG), extending the reading frame by 19 or 20 codons in the 3′ direction, and conferring different properties on the two forms sHD and LHD.
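The effect of this editing event on the reading frame can be illustrated by a minimal sketch. The RNA used below is invented for the illustration and is not an HDV sequence, and the helper names `orf_length_in_codons` and `edit_amber` are hypothetical.

```python
# Toy illustration: editing an amber stop codon (UAG -> UGG) extends the
# open reading frame up to the next in-frame stop codon.
# The RNA below is invented for the example; it is NOT an HDV sequence.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def orf_length_in_codons(rna: str) -> int:
    """Count sense codons from the first AUG up to the first in-frame stop."""
    start = rna.find("AUG")
    if start < 0:
        return 0
    count = 0
    for i in range(start, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


def edit_amber(rna: str, codon_index: int) -> str:
    """Change the middle A of the UAG codon at codon_index (counted from AUG) to G."""
    start = rna.find("AUG")
    pos = start + 3 * codon_index
    if rna[pos:pos + 3] != "UAG":
        raise ValueError("no amber codon at this position")
    return rna[:pos + 1] + "G" + rna[pos + 2:]


# AUG + 3 sense codons + amber stop + 2 further codons + opal stop
unedited = "AUG" + "GCU" * 3 + "UAG" + "CCA" * 2 + "UGA"
edited = edit_amber(unedited, 4)

print(orf_length_in_codons(unedited))  # short form of the toy ORF
print(orf_length_in_codons(edited))    # extended form of the toy ORF
```

In this toy case the extension is three codons; in HDV, according to the text above, the same mechanism extends the frame by 19 or 20 codons.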

The mRNA terminates with a poly(A) tail, 15 nucleotides after the polyadenylation consensus signal AAUAAA (positions 954-959).

In the replication cycle, the functions of the 24 and 27 kDa proteins are opposite: sHD activates viral replication, whereas LHD suppresses it and plays a role in assembly of the viral particles. These proteins are phosphorylated on serine residues but not glycosylated (Table I). They consist of common functional domains and of a domain specific to the large protein LHD.

TABLE I: Summary and comparison of the functions of the two forms p24 and p27

  Biochemical and biological activities   p24 (S)   p27 (L)
  --------------------------------------- --------- ---------
  amino acids                             195       214
  transactivation of replication          +         −
  transinhibition of replication          −         +
  dimerization and polymerization         +         +
  RNA binding                             +         +
  RNA stabilization                       +         +
  nuclear localization                    +         +
  assembly                                −         +
  phosphorylation                         +         + (×6)
  19 specific carboxy-terminal aa         −         +
  farnesylation                           −         +

Briefly, the various domains of these two proteins are as follows:

-   Common Domains
    -   The polymerization domain, which comprises the sequence between amino acid residues 13 and 48, made up of an arrangement of leucine or isoleucine, organized in a “leucine zipper”-type α-helix, involved in protein polymerization, essential for (i) transactivation of viral replication by the sHD-Ag, (ii) inhibition of replication by the LHD-Ag and (iii) assembly of the sHD-LHD complexes in HBV envelopes.
    -   The nuclear localization signal (NLS), which involves two nuclear localization sequences identified in the 67-88 region, essential for translocating the sHD-Ag synthesized in the cytoplasm, and perhaps the ribonucleoprotein after its entry into the cell, to the nucleus.
    -   The RNA-binding site which is based on two arginine-rich sequences located between residues 97 and 163, which allow binding of the sHD proteins to the genomic or antigenomic RNA. This binding is essential for the sHD-Ag to activate replication.
-   Specific Domains

The 19-20 amino acids located at the COOH end of the large protein have an important role in the HDV cycle. Specifically, these amino acids (aa 195-214) are involved in assembly of the viral particles (Chang et al., 1991). This activity could be partly linked to the presence of a cysteine at position 211 (Glenn et al., 1992), which is conserved for all viral genomes characterized to date. This cysteine, located 4 amino acids before the COOH end of the protein, forms a “CXXX” box and attaches a farnesyl group (Glenn et al., 1992), a 15 carbon chain derived from mevalonic acid, through the action of a farnesyltransferase. This post-translational maturation directs the proteins to the cell membranes.
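As an illustration of the “CXXX” box described above, the following minimal sketch checks whether a protein sequence carries a cysteine exactly four residues from its COOH end. The function name `has_cxxx_box` is hypothetical; the test sequence is the carboxy-terminal sequence RLPLLECTPQ given further on in this description (SEQ ID NO:59), and the second sequence is an invented counter-example.

```python
import re

# A "CXXX" box: a cysteine followed by exactly three residues at the COOH end.
CXXX_BOX = re.compile(r"C[A-Z]{3}$")


def has_cxxx_box(protein: str) -> bool:
    """Return True if the one-letter protein sequence ends with C followed by 3 residues."""
    return CXXX_BOX.search(protein.strip().upper()) is not None


print(has_cxxx_box("RLPLLECTPQ"))  # carboxy-terminal sequence quoted in this description
print(has_cxxx_box("RLPLLEATPQ"))  # invented counter-example without the cysteine
```

Such a check only identifies the sequence motif; whether the protein is actually farnesylated depends on the farnesyltransferase and is not something the sketch can establish.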

The small and the large protein have, moreover, been differentiated with monoclonal antibodies (clone 9E4) (Hwang and Lai, 1993a). These antibodies only recognise sHD (Lai et al., 1993). Since the amino acid sequence of the small protein is included in the large protein, these results suggest a difference in conformation between sHD and LHD within the 30 carboxy-terminal amino acids of the small protein sHD, suggesting that the epitope recognised on sHD is masked in LHD under non-denaturing conditions.

HDV is transmitted especially via contaminated needles and blood, and therefore via HDV or HBV carriers.

In North America and in Western Europe, hepatitis D is therefore found especially in intravenous drug users, hemophiliacs and individuals who have received multiple transfusions.

The epidemiology and the modes of transmission of HDV and HBV partially overlap. It is estimated overall that the proportion of HBs-Ag carriers infected with HDV is 5%. However, disparities in geographical and epidemiological prevalence are noted.

A high prevalence of this disease, in hepatitis B virus carriers, exists in certain regions of the world, including the Amazon Basin of South America, central Africa and southern Italy, and in the countries of the Middle East.

In the Mediterranean region, most particularly in southern Italy, in Greece and in the Middle East, where the frequency of chronic HBV carriers is intermediate (1% to 5%), infection with HDV is high. In these regions, intrafamily transmission has been suggested on the basis of phylogenetic studies of viruses infecting members of the same family (Niro et al., 1999). In southern Italy, the prevalence in HBs-Ag-positive individuals is decreasing, dropping from 23% in 1987 to 8% in 2000 (Gaeta et al., 2000).

In Africa and in Asia, where the frequency of chronic HBV carriers is high (10% to 20%), and also in South America and in the Pacific Islands, where this frequency is intermediate (1% to 5%), the distribution of HDV is paradoxically disparate. In Africa, seroprevalence studies show a very heterogeneous distribution of patients having anti-HD antibodies, whereas the overall prevalence of HBV infection, estimated by detecting HBs-Ag, stabilizes between 12 and 14% (Roingeard et al., 1992). Thus, varying levels of 4% (northern region of Senegal) to 44% (Dakar suburbs) reveal probable socioeconomic factors involved in transmission.

HDV prevalence studies should be interpreted carefully. This is because, in the populations studied, there is a preferential inclusion of patients suffering from hepatopathies. In patients suffering from acute or chronic hepatitis, the prevalence of HDV infection is greater than in chronic asymptomatic HBV carriers. In addition, the serological investigation of an HDV infection is based on the detection of HD-Ag and of total anti-HD antibodies in the serum. As a result, acute benign infections, during which an isolated transient production of anti-HD IgM would develop, would not be registered.

HDV is responsible for acute and chronic forms of hepatitis. These infections are particularly serious and evolve more rapidly to cirrhosis than hepatitis B alone. This is one of the reasons for which the reliable diagnosis of HDV associated with HBV is crucial.

Infection with an HDV is dependent on HBV. HDV isolates from different geographical regions show genetic variability. Currently, three genotypes have been identified and named genotype-I, -II and -III.

Genotyping is used to study the epidemiology of viral transmission and the geographic distribution of the virus, and the genotype might be correlated with the pathogenic potency.

HDV only develops in patients also infected with HBV. This double infection ensues either from a co-infection or from a superinfection:

-   Co-infection is the cause of an acute hepatitis. The diagnosis, considered in the event of hepatic cytolysis, is based on the detection of markers for HDV associated with the presence of anti-HBc IgM. The HBs-Ag, which is generally present, is exceptionally negative, justifying repetition of the samples in order to monitor the kinetics of evolution of the markers. It is conventional to observe an inhibition of HBV replication by HDV. The anti-HBc IgMs reflect the recent infection with HBV. The HD-Ag, which is very early, is rarely detected given its transient nature. The antibodies appear 2 to 3 weeks after the beginning of symptoms: anti-HD IgMs are predominant, but the titer thereof remains moderate (<1:1000). Two transaminase elevation peaks, separated by two to five weeks, are observed in 10 to 20% of co-infections, probably reflecting different viral replication kinetics. Co-infection is therefore characterized by the acute hepatitis often being more severe than that caused by HBV alone. Thus, fulminant hepatitis is described in South America and in sub-Saharan Africa or in certain populations. The progression is generally marked by resolution of the hepatitis after the acute phase and, as in the natural history of HBV, only 5% of co-infected patients progress to a chronic form of the disease.
-   Superinfection is characterized by the appearance of an HDV seroconversion in a patient who is a chronic HBs-Ag carrier. The HDV viremia precedes the appearance of anti-HDV antibodies in the absence of detection of anti-HBc IgMs. The detection of these markers may precede an increase in transaminases by several months. In the acute phase, the superinfection results in fulminant hepatitis in more than 10% of cases. In addition, once the acute phase has passed, the superinfection frequently (60 to 70%) results in chronic active hepatitis with rapid progression to cirrhosis. In the acute phase of the superinfection, detection of the HD-Ag is rapidly followed by the appearance of antibodies, which persist at high levels. Unlike the conventional models of viral infection, anti-HD IgG and anti-HD IgM are simultaneously detected in chronic hepatitis B-delta.

Table II below summarizes the evolution of the B and delta markers during co-infections and superinfections.

  Evolution                  Acute phase   Chronic   Recovery
  -------------------------- ------------- --------- ----------
  Co-infection with HDV
  HBs-Ag                     +             +         −
  anti-HBc IgM               +             −         −
  HD-Ag                      +/−           −         −
  anti-HD IgM                +/−           +         −
  anti-HD IgG                +/−           +         +
  HDV RNA                    +             +         −
  intrahepatic HD-Ag         +             +         −
  Superinfection with HDV
  HBs-Ag                     +             +         −
  anti-HBc IgM               −             −         −
  HD-Ag                      +/−           −         −
  anti-HD IgM                +             +         −
  anti-HD IgG                +             +         +
  HDV RNA                    +             +         −
  intrahepatic HD-Ag         +             +         −

Co-infection and superinfection are clinically indistinguishable. The virological diagnosis is usually based on the various serum markers. More rarely, the HD-Ag can be detected on the anatomical/pathological liver biopsy sections.

The markers make it possible to follow the progression of the disease to recovery or to a chronic form, to decide upon what treatment should be given to a patient and to evaluate the effectiveness thereof.

HDV cannot be isolated in cell culture and the diagnosis is therefore based essentially on the search for HD-Ag (ELISA, IF) or for the viral genome (hybridization, PCR, real-time PCR) for direct techniques and on the detection of anti-HD IgM and anti-HD IgG antibodies for indirect methods (ELISA).

-   The search for intrahepatic HD-Ag can be questioned in fulminant hepatitis given the kinetics of appearance of the seromarkers. This examination is of value as a reference for studying HDV replication, but cannot be used routinely.
-   HD-Ag is sought in the serum in the presence of a dissociating agent which exposes the HD-Ag, included in the viral envelope bearing the HBs-Ag. The presence of a high titer of anti-HD antibodies (Abs) (chronic hepatitis) which bind the serum antigens impairs the detection. Western blotting techniques have been developed for research purposes. The presence of the virus in the blood is transient and limited to the early phase of infection, and the possibility of detecting the HD-Ag decreases over the days following the appearance of symptoms.
-   Immunocapture is used to detect anti-HD IgMs and competition for anti-HD IgGs. The ELISA techniques first used as antigen the HD-Ag from serum or from liver of infected patients or animals. The new assays are based on recombinant HD-Ags or synthetic peptides.
-   Hybridization or RT-PCR techniques make it possible to detect the genomic RNA after extraction of the nucleic acids and denaturation of the secondary structures. Several primer systems have been described: the choice thereof is decisive since the genetic variability in “conserved” regions can result in false negatives if the primers chosen are not suitable for the circulating viral strains. The choice of PCR primers should take into account the local epidemiology of the genotypes described, and it is essential to be fully aware of the distribution of these genotypes throughout the world.

However, both in the case of co-infection and in the case of superinfection, the HD-Ag is in fact difficult to detect, although the viremia precedes the appearance of antibodies.

In this context, and in particular due to the demonstration of new genotypes, nucleic acid and protein reagents for diagnosing HDV, whatever the genotype, are needed.

In fact, the study of the nucleotide sequences of HDV by various teams around the world has made it possible to differentiate, until now, only three distinct genotypes:

-   Genotype-I, which is the most common and the most widespread throughout the world. Since the initial description (experimentally infected chimpanzee) by Wang (Wang et al., 1986; Wang et al., 1987), several groups have sequenced the genome of HDV from different geographical isolates. The first sequence of an HDV in humans was described in 1987, in the United States, by S. Makino et al., in a patient who was a drug addict (Makino et al., 1987). Genotype-I is very widespread in Italy, in the United States, Taiwan, Nauru, France, Lebanon and China (Makino et al., 1987; Chao et al., 1991b; Imazeki et al., 1991; Lee et al., 1992; Niro et al., 1997; and Shakil et al., 1997). Within genotype-I, a percentage of nucleotide similarity of greater than 85% is described.
-   A Japanese isolate (Imazeki et al., 1990; Imazeki et al., 1991) is the prototype of a 2nd subgroup of HDV. This genotype-II, which has initially only been described in Japan and in Taiwan (Imazeki et al., mentioned above; Lee et al., 1996b), appears to have much wider geographical distribution. In particular, genotype-I and genotype-II sequences originating from Yakutia (Russia) have also been characterized. Finally, some authors use an intragenotypic diversity as a basis for dividing genotype-II into subtypes IIA (Imazeki et al., 1990; Imazeki et al., 1991; Lee et al., 1996b), IIB (Wu et al., 1998; Sakugawa et al., 1999) and IIC. In some countries, infection with genotype-II viruses is thought to be associated with forms of hepatitis which are less severe than those caused by genotype-I or -III HDVs (Wu et al., 1995b).
-   In 1993, a 3rd group was described for Peruvian and Colombian virus genomes (Casey et al., 1993a). Genotype-III has only been described in South America, and more particularly in the Amazon Basin, associated with severe hepatitis, or even with epidemic fulminant hepatitis with microvesicular steatosis (Casey et al., 1993a; Casey et al., 1996b) and with high morbidity and mortality. In this geographical region, it is observed that HDV genotype III is preferentially associated with HBV genotype F. Other isolates of this group have recently been isolated in Venezuela (Nakano et al., 2001).

When comparing all the genomes, two to four conserved regions are described (Chao et al., 1991b). Two are consistently found and are centered around the self-cleavage sites of the genomes and antigenomes involved in the autocatalytic activity. The other two conserved regions are located in the reading frame encoding the HD protein (Chao et al., 1991b).

However, the detection techniques are dependent on the genetic variability of the virus sought; the known reagents, in particular based on the sequences specific for genotype-I, -II or -III, do not make it possible to detect infections with a variant HDV and in particular HDVs with a genotype different from those mentioned above.

Consequently, the detection techniques specified above risk giving negative results both at the nucleic acid level and in terms of the antibody response.

Identifying novel variants and taking them into account is important for developing reagents for detecting and diagnosing hepatitis D (serodiagnosis, PCR, hybridization) which are sufficiently sensitive and specific, i.e. which do not produce falsely negative or falsely positive results: in fact, a dissociation between positive anti-HD IgM and negative HDV RNA can, at the current time, be observed in the context of a severe hepatopathy.

In the context of their studies, the inventors have now demonstrated, surprisingly, that the genetic diversity of HDV is significantly greater than previously described, which has consequences for diagnostic reliability.

They have in particular demonstrated nine novel complete HDV sequences (three originating from Yakutia and six originating from Africa), which are also circulating in the Ile-de-France region and which do not belong to any of the known genotypes.

Analysis of these novel isolates:

-   confirms the existence of a much greater variability of HDV than that described to date,
-   calls into question the classifying of the HDVs in only three genotypes,
-   has led the inventors to propose a PCR-RFLP algorithm based on a partial region of the genome for HDV genotyping and
-   has led the inventors to develop reagents suitable for reliable diagnosis of HDV infections, whatever the genotype, whereas previously, many falsely negative results were observed (existence of new genotypes).

The inventors have therefore set themselves the aim of providing HDV nucleic acid molecules capable of allowing the detection of a variant HDV with respect to the three genotypes previously described.

The subject of the present invention is therefore isolated nucleic acid molecules, characterized in that they are selected from the group consisting of:

-   the genome of an HDV, which in molecular terms exhibits, over its entire genome, a genetic divergence or distance ≥20% (less than 80% similarity) with respect to the sequences of an HDV genotype I, of an HDV genotype II or of an HDV genotype III,
-   the genome of an HDV, which in molecular terms exhibits a genetic divergence or distance ≥25% (less than 75% similarity), over a region referred to as R0, delimited by positions 889 to 1289 of the HDV genome, with respect to the corresponding R0 sequences of an HDV genotype I, of an HDV genotype II or of an HDV genotype III,
-   the complete genomes of the HDV isolates or variants referred to as dFr45, dFr47, dFr73, dFr910, dFr48 and dFr644, which exhibit, respectively, the sequences SEQ ID NOS: 1, 6, 11, 16, 21 and 26, and
-   the genome of an HDV which exhibits a genetic divergence or distance ≤15% with at least one of the sequences SEQ ID NOS: 1, 6, 11, 16, 21 and 26.
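The divergence thresholds above can be illustrated by a minimal sketch. It assumes that the sequences have already been aligned and uses a simple proportion of differing positions (p-distance) that skips gap columns; the distance measure actually used is not specified in this passage, so this choice, the function names `p_distance` and `is_divergent_from_all`, and the toy sequences are assumptions made for the illustration.

```python
from typing import Iterable


def p_distance(a: str, b: str) -> float:
    """Proportion of differing positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return diffs / compared


def is_divergent_from_all(query: str, references: Iterable[str],
                          threshold: float = 0.20) -> bool:
    """True if the query diverges by at least `threshold` from every reference."""
    return all(p_distance(query, ref) >= threshold for ref in references)


# Toy aligned sequences, invented for the example (not HDV sequences).
query = "ACGTACGTAC"
reference = "ACGTACGTTT"
print(p_distance(query, reference))
print(is_divergent_from_all(query, [reference], threshold=0.20))
```

The same functions could be applied with a threshold of 0.25 to aligned R0 regions, or used in the opposite direction (distance at most 0.15) to test relatedness to one of the variant sequences.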

According to an advantageous embodiment of said molecules, the R0 region is preferably obtained by amplification of the HDV RNA with the primers 900S (SEQ ID NO: 33) and 1280AS (SEQ ID NO: 34).

For the purpose of the present invention, the term “nucleic acid molecule” is intended to mean a cDNA or RNA molecule exhibiting one of the HDV genomic sequences as defined above and the sense and antisense sequences complementary thereto.

A subject of the present invention is also nucleic acid molecules which comprise at least one of the fragments of the sequences of a variant HDV as defined above, selected from the group consisting of:

-   a) the R0 fragments of the following isolated variant HDVs: dFr45, dFr47, dFr48, dFr69, dFr73, dFr644, dFr910, dFr1843, dFr1953, dFr2020 and dFr2066 which exhibit, respectively, the following sequences: SEQ ID NO: 48 to SEQ ID NO: 58,
-   b) the R1 fragment which extends from position 307 to position 1289 of the HDV genome,
-   c) the R2 fragment which extends from position 889 to position 328 of the HDV genome,
-   d) the R3 fragment which extends from position 1486 to position 452 of the HDV genome,
-   e) the R′1 fragment which extends from position 305 to position 1161 of the HDV genome,
-   f) the R′2 fragment which extends from position 984 to position 331 of the HDV genome,
-   g) the R644 fragment which extends from position 889 to position 446 of the HDV genome,
-   h) the G910 fragment which extends from position 1206 to position 929 of the HDV genome,
-   i) the p910 fragment which extends from position 553 to position 1550 of the HDV genome,
-   j) the cDNAs encoding the sHD protein, of sequences SEQ ID NOS: 4, 9, 14, 19, 24 and 29,
-   k) the cDNAs encoding the LHD protein, of sequences SEQ ID NOS: 2, 7, 12, 17, 22 and 27, and
-   l) the primers of sequence SEQ ID NO: 33 to SEQ ID NO: 47.

For the purposes of the present invention, the positions of the fragments in the HDV genome are indicated on the circular genome in genomic orientation, according to the numbering of Wang et al., 1986 or 1987.
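Because the genome is circular, a fragment whose end position is numerically smaller than its start position (for example R2, from 889 to 328) wraps around the origin of the numbering. The following minimal sketch computes fragment lengths under that convention. The genome length of 1680 nucleotides is an assumption made for the illustration (the exact length varies between isolates and is not stated in this passage), and the function name `circular_fragment_length` is hypothetical.

```python
def circular_fragment_length(start: int, end: int, genome_length: int) -> int:
    """Length of a fragment read from `start` to `end` (1-based, inclusive)
    on a circular genome, wrapping past the last position when end < start."""
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("positions must lie within the genome")
    if end >= start:
        return end - start + 1
    return (genome_length - start + 1) + end


ASSUMED_GENOME_LENGTH = 1680  # assumption for illustration only

for name, (start, end) in {"R0": (889, 1289),
                           "R2": (889, 328),
                           "R3": (1486, 452)}.items():
    print(name, circular_fragment_length(start, end, ASSUMED_GENOME_LENGTH))
```

Under this assumed length the wrapped fragments come out at roughly 1100 nucleotides for R2 and roughly 650 for R3, with about 400 for R0.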

The invention also encompasses nucleotide fragments complementary to the above, and also fragments which have been modified with respect to the above, by deletion or addition of nucleotides in a proportion of approximately 15% with respect to the length of the above fragments and/or modified in terms of the nature of the nucleotides, provided that the modified nucleotide fragments conserve an ability to hybridize with the genomic or antigenomic RNA sequences of the isolates as defined above.

In fact, these various viral strains, in the same patient, at a given time, show a heterogeneous population of HDV RNA molecules; in addition, in the course of a chronic infection, in addition to the heterogeneities observed at the editing site (position 1012), mutations may appear. Viral sequences appear to evolve within viral populations with a variable substitution rate of 3×10⁻² to 3×10⁻³ per nucleotide and per year.

Some of these fragments are specific and are used as probes or as primers; they hybridize specifically to a variant HDV strain as defined above or to a related strain; the expression “HDV related to a variant as defined above” is intended to mean an HDV exhibiting a genetic divergence ≤15%.

Such fragments are used for the detection and the epidemiological monitoring of HDV infections. For example, the R0 fragment is used for the detection (RT-PCR) and the genotyping (PCR-RFLP) of HDV. The other fragments which cover the entire HDV genome are used for the molecular characterization of the variant HDVs; phylogenetic analysis of the complete sequence of the genome or of fragments thereof corresponding in particular to R0 or to R2 makes it possible to link the profiles observed by PCR-RFLP to a given genotype or to characterize new genotypes.

Consequently, a subject of the present invention is also a method for detection of a variant HDV according to the invention, by hybridization and/or amplification, carried out from a biological sample, which method is characterized in that it comprises:

-   (1) a step consisting in extracting the nucleic acid to be detected, belonging to the genome of the virus possibly present in the biological sample,
-   (2) carrying out at least one gene amplification using a pair of primers selected from the group consisting of the primers capable of amplifying one of the following regions of the HDV genomic RNA: R0, R1, R2, R3, R644, G910, p910, R′1 and R′2, and
-   (3) analyzing the amplified product by comparison with one of the molecules of sequences SEQ ID NOS: 1, 6, 11, 16, 21 and 26, corresponding respectively to the complete genomes of the isolates or variants referred to as dFr45, dFr47, dFr73, dFr910, dFr48 and dFr644.

Advantageously, the analytical step (3) can be carried out by restriction, sequencing or hybridization; in the latter case, the probe used (in particular in DNA chips) would advantageously be a fragment of 15 to 20 nucleotides, specific for said amplified fragments.

According to an advantageous embodiment of said method, the specific primers for amplifying the regions R0, R1, R2, R3, R644, G910, p910, R′1 and R′2, used in step (2), are selected from the group consisting of:

-   the primers 900S (SEQ ID NO:33) and 1280AS (SEQ ID NO:34), for the amplification of R0 (approximately 400 bp),
-   the primers 320S (SEQ ID NO:39) and 1280AS (SEQ ID NO:34), for the amplification of the R1 fragment (approximately 960 bp),
-   the primers 900S (SEQ ID NO:33) and 320AS (SEQ ID NO:45), for the amplification of R2 (approximately 1100 bp), which contains the sHD gene corresponding to positions 1598-950,
-   the primers 1480S (SEQ ID NO:46) and 440AS (SEQ ID NO:47), for the amplification of R3 (approximately 650 bp),
-   the primers 900S (SEQ ID NO:33) and 420AS (SEQ ID NO:40), for the amplification of the region R644 (approximately 1250 bp) of the isolate dFr644,
-   the primers 318S (SEQ ID NO:35) and 1150AS (SEQ ID NO:36), for the amplification of R′1 (approximately 850 bp),
-   the primers 960S (SEQ ID NO:37) and 345AS (SEQ ID NO:38), for the amplification of R′2 (approximately 1050 bp),
-   the primers R910S (SEQ ID NO:41) and R910AS (SEQ ID NO:42), for the amplification of the region G910 (approximately 1400 bp) of the isolate dFr910,
-   the primers S1910R (SEQ ID NO:43) and AS1910R (SEQ ID NO:44), for the amplification of the region p910 (approximately 650 bp) of the isolate dFr910.

A subject of the present invention is also a method for detection and for genotyping of HDV from a biological sample, which method is characterized in that it comprises:

-   (a) a step consisting in extracting the nucleic acid belonging to the genome of the HDV virus,
-   (b) a step consisting in amplifying the region R0 delimited by position 889 to position 1289 of the HDV genome,
-   (c) a first treatment of the amplified nucleic acid molecules with the SmaI and XhoI restriction enzymes, so as to produce a first set of restriction fragments,
-   (d) a second treatment of nucleic acid molecules with the SacII restriction enzyme, so as to produce a second set of restriction fragments, and
-   (e) the combined analysis of the two sets of restriction fragments produced by RFLP (Restriction Fragment Length Polymorphism), so as to detect the presence and/or to determine the type of HDV present in said biological sample.
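Steps (c) to (e) above can be sketched as an in silico digestion of a linear amplicon. The recognition sequences and cut positions used (SmaI CCC^GGG, XhoI C^TCGAG, SacII CCGC^GG) are the standard ones for these enzymes; the amplicon is invented for the illustration and is not an R0 sequence, the function names are hypothetical, and the correspondence between restriction profiles and genotypes is not reproduced here.

```python
# Recognition site and cut offset within the site (top strand) for each enzyme.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "SmaI": ("CCCGGG", 3),   # CCC^GGG
    "XhoI": ("CTCGAG", 1),   # C^TCGAG
    "SacII": ("CCGCGG", 4),  # CCGC^GG
}


def cut_positions(seq: str, enzymes: list[str]) -> list[int]:
    """Sorted cut coordinates (0-based, between bases) for the given enzymes."""
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        i = seq.find(site)
        while i != -1:
            cuts.add(i + offset)
            i = seq.find(site, i + 1)
    return sorted(cuts)


def fragment_lengths(seq: str, enzymes: list[str]) -> list[int]:
    """Fragment lengths, in 5'->3' order, after digesting a linear amplicon."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Invented 32-nt amplicon containing one site for each enzyme.
amplicon = "AAAA" + "CCCGGG" + "TTTTT" + "CTCGAG" + "AAA" + "CCGCGG" + "TT"

first_set = fragment_lengths(amplicon, ["SmaI", "XhoI"])  # step (c)
second_set = fragment_lengths(amplicon, ["SacII"])        # step (d)
print(first_set, second_set)                              # compared in step (e)
```

The two lists of fragment lengths together form the profile that would be compared, in step (e), with the profiles expected for each genotype.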

According to an advantageous embodiment of said method, the amplification step (b) is carried out with the primers 900S (SEQ ID NO:33) and 1280AS (SEQ ID NO:34).

The method according to the invention makes it possible to define new restriction profiles and to classify the HDVs into seven distinct genotypes.

According to another advantageous embodiment of said method, it also comprises:

-   (f) amplification of the nucleic acid molecules of said sample by RT-PCR with the primers 900S (SEQ ID NO:33) and 320AS (SEQ ID NO:45), so as to amplify the R2 region, and
-   (g) direct sequencing of the amplified R2 region and comparison with one of the RNA molecules of sequences SEQ ID NOS: 1, 6, 11, 16, 21 and 26, corresponding respectively to the complete genomes of the isolates or variants referred to, respectively, as dFr45, dFr47, dFr73, dFr910, dFr48 and dFr644, for example by phylogenetic analysis.

When unusual profiles are observed, this additional step makes it possible to characterize new genotypes. Specifically, these analyses complementary to the PCR-RFLP make it possible to link the new profile observed to a given genotype, or to characterize a new genotype, by phylogenetic analysis.

A subject of the present invention is also a recombinant vector, in particular a plasmid, comprising an insert consisting of a nucleic acid molecule as defined above.

A subject of the present invention is also a cell transformed with a nucleic acid molecule as defined above.

A subject of the present invention is also translation products encoded by one of the RNA molecules of sequences SEQ ID NOS: 1, 6, 11, 16, 21 and 26 corresponding respectively to the complete genomic RNAs of the isolates or variants referred to, respectively, as dFr45, dFr47, dFr73, dFr910, dFr48 and dFr644, or by the sense or antisense sequences complementary thereto.

A subject of the present invention is also the proteins encoded by the genome of a variant HDV as defined above.

According to an advantageous embodiment of the invention, said protein is selected from the group consisting of:

-   -   the LHD protein of dFr45, dFr47, dFr73, dFr910, dFr48 and dFr644 which exhibit, respectively, the sequences SEQ ID NOS: 3, 8, 13, 18, 23 and 28, and
    -   the sHD protein of dFr45, dFr47, dFr73, dFr910, dFr48 and dFr644 which exhibit, respectively, the sequences SEQ ID NOS: 5, 10, 15, 20, 25 and 30.

A subject of the present invention is also a peptide characterized in that it consists of a fragment of a protein as defined above, selected from the group consisting of:

-   -   peptide A consisting of the 19 amino acids of the carboxy-terminal end of the sequences SEQ ID NOS:3, 8, 13, 18, 23 and 28,
    -   peptide B of sequence (one-letter code) RLPLLECTPQ (SEQ ID NO:59) consisting of the 10 amino acids of the carboxy-terminal end of the sequences SEQ ID NOS:3, 8, 13, 18, 23 and 28, and
    -   peptide C consisting of the 9 amino acids preceding the sequence SEQ ID NO:59 (SEQ ID NO:60 to SEQ ID NO:65).

Such peptides are useful for the indirect diagnosis (serology) of an HDV infection, in particular by an immunoenzymatic method (ELISA):

-   -   peptide B, which is conserved, makes it possible to detect all the variants according to the invention and HDV genotype II, and
    -   peptide C is specific for the various HDV variants according to the invention.
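
By way of illustration only, the relationship between peptides A, B and C defined above can be sketched as simple slices of the carboxy-terminal end of an LHD sequence. The function name and the toy sequence below are hypothetical; only the final ten residues (RLPLLECTPQ, SEQ ID NO:59) are taken from the text, and the residues placed before them are arbitrary placeholders, not any of SEQ ID NOS:60 to 65.

```python
def c_terminal_peptides(lhd: str) -> dict[str, str]:
    """Cut peptides A, B and C from the carboxy-terminal end of an LHD sequence."""
    if len(lhd) < 19:
        raise ValueError("sequence shorter than 19 residues")
    return {
        "A": lhd[-19:],     # the 19 carboxy-terminal residues
        "B": lhd[-10:],     # the 10 carboxy-terminal residues
        "C": lhd[-19:-10],  # the 9 residues preceding peptide B
    }


# Placeholder residues followed by the conserved tail quoted in the text.
toy_lhd = "MGGGGG" + "ACDEFGHIK" + "RLPLLECTPQ"
peptides = c_terminal_peptides(toy_lhd)
for name, sequence in peptides.items():
    print(name, sequence, len(sequence))
```

Peptide A is, by construction, peptide C followed by peptide B, which is why a variant-specific and a conserved reagent can both be derived from the same 19 residues.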

A subject of the present invention is also the use of a nucleic acid molecule as defined above or of a protein as defined above, for preparing a kit for detecting and genotyping an HDV.

Besides the above arrangements, the invention also comprises other arrangements which will emerge from the following description, which refers to examples of implementation of the present invention and also to the attached drawings in which:

FIG. 1 represents the phylogenetic tree of the R0 region, obtained by the neighbor-joining method. The numbers in italics indicate the bootstrapping values (BVs) on 10⁴ re-samplings and the sign π indicates the BVs<50%. The scale represents the number of nucleotide substitutions per site.

FIG. 2 represents the phylogenetic tree of the R0 regions of HDV, obtained by the maximum parsimony method. The Peru1, Peru2 and Columbia isolates are chosen as “outgroup”. The figures in italics indicate the bootstrapping values (BVs) on 10⁴ re-samplings,

FIG. 3 illustrates the clinical data from each of the six patients infected with the HDV isolates of African origin. * indicates, respectively, the 6S/6As PCR and R0 PCR,

FIG. 4 represents the phylogenetic tree of the complete genomes of HDV, obtained by the neighbor-joining method. The numbers in italics, at each node, indicate bootstrapping values (BVs) on 10⁴ re-samplings. The scale represents the number of nucleotide substitutions per site.

FIG. 5 represents the phylogenetic tree of the complete genomes of HDV, obtained by the maximum parsimony method. The numbers in italics, at each node, indicate the bootstrapping values (BVs) on 10⁴ re-samplings,

FIG. 6 represents alignment of the amino acid sequences of the delta proteins of the six isolates of African origin (lines 7, 8, 9, 10, 11 and 12-SEQ ID NOS:74-79) with the known sequences of genotype I (lines 13, 14 and 15-SEQ ID NOS:80-82), genotype II (lines 3, 4, 5 and 6-SEQ ID NOS:70-73), genotype III (line 16-SEQ ID NO:83) and TW2b/Miyako (lines 1 and 2-SEQ ID NOS:68-69), using the Clustal program (version 1.8). The amino acid at position 196 of the p27 protein corresponds to the termination codon of the p24 protein (Z) or to the tryptophan codon (W) which results in the synthesis of the p27 protein which extends from amino acids 1 to 215.

It should be clearly understood, however, that these examples are given only by way of illustration of the subject of the invention, of which they in no way constitute a limitation.

EXAMPLE 1 Materials and Methods

1—Patients and Samples

22 sera originating from individuals monitored in various hospital centers of the Parisian region were analyzed. The patients were chronic HBs-Ag carriers. Diagnosis of the delta infection was performed by searching for serological markers (HD-Ag, IgM- and IgG-type anti-HD-Ag) and detection of the HDV viral genome by RT-PCR. HD-Ag was not detected in any of the sera analyzed. IgM-type anti-HD-Ag antibodies, reflecting the chronic nature of the delta infection, and IgG-type antibodies were found in all the patients. The entire HDV genome was characterized in six of the patients. All the sera were stored at −80° C. until extraction of the viral RNA.

2—HDV RNA Extraction

To extract the HDV RNA, a 250 μl volume of serum was added to 75 μl of TRIzol LS Reagent (Gibco BRL, Life Technologies). After homogenization for 30 seconds, the mixture was incubated for 5 min at ambient temperature. Lipid extraction was carried out by adding 200 μl of chloroform cooled to +4° C. After a further homogenization with a vortex, the tubes were incubated and then centrifuged at 14 000 rpm for 10 min at +4° C. The aqueous phase was transferred into extraction tubes and the RNAs were precipitated with 500 μl of cold isopropanol, in the presence of 1 μg of glycogen. After homogenization for 15 min, the samples were centrifuged at 14 000 rpm for 10 min at +4° C. After rinsing with 70% ethanol, the tubes were again centrifuged at 14 000 rpm for 10 min at +4° C. The pellets were dried under a hood at ambient temperature, and then taken up in 100 μl of sterile water comprising a ribonuclease inhibitor (RNasin, Promega). At this stage, precautions were taken to avoid possible contamination of the buffers and of the samples with ribonucleases.

3—Synthesis of a Complementary DNA (cDNA)

This step consists in synthesizing a DNA strand complementary to the HDV RNA by reverse transcription.

In order to eliminate the secondary structure of the HDV RNA, 5 μl of previously extracted RNA were added to a reaction mixture containing 5 μl (or 0.5 pmol) of deoxynucleotide triphosphates (dNTPs) and 1 μl (0.4 pmol) of random hexanucleotides. The RNAs were then denatured for 3 min at 95° C. In order to fix the denatured RNAs, the tubes were immediately frozen in ethanol cooled to −20° C. Ten μl of a reaction mixture, containing 2.5 μl of dithiothreitol (DTT), 100 units (U) of Superscript II reverse transcriptase (Gibco BRL, Life Technologies) and its reaction buffer and also 20 U of ribonuclease inhibitor (RNasin, Promega) were added to the denatured RNA. The reverse transcription reaction was carried out at 42° C. for 45 min and then stopped by incubation at 94° C. for 5 min. The cDNAs were then stored at −80° C.

4—Gene Amplification

The cDNA amplification is carried out, exponentially, by PCR (Polymerase Chain Reaction). Two types of polymerases were used: AmpliTaq Gold polymerase (Thermus aquaticus) (PE Applied Biosystems) and Pwo polymerase (Pyrococcus woesei) or the Expand™ High Fidelity PCR system (Roche).

The amplification was carried out using 5 μl of cDNA, which are added to a PCR reaction mixture containing: 0.25 pmol/μl of sense and antisense primer (Table III), 200 μmol of each dNTP, 1.5 mM of MgCl₂, 1 U of AmpliTaq Gold or 2.6 U of Expand™ polymerase in the presence of the corresponding PCR buffers. The PCR reaction was carried out in a thermocycler (PCR Sprint, Hybaid, Coger), according to the following protocol: denaturation of the cDNA-RNA hybrids at 94° C. for 9 min, followed by 40 successive cycles, each comprising denaturation of the DNAs at 94° C. for 45 sec, hybridization of primers (900S/1280AS or 6S/6AS) at 58° C. for 30 sec, and synthesis of the complementary strand, using the polymerase, by elongation at 72° C. for 45 sec. Finally, a final elongation was carried out at 72° C. for 4 min 30 sec.
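
As a simple check of the protocol above, the cycling program can be written down as data and its total programmed hold time computed. This is only a sketch: the class and variable names are hypothetical, and the ramping time between temperatures, which depends on the thermocycler, is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Values taken from the protocol described in the text.
INITIAL = Step("initial denaturation", 94, 9 * 60)
CYCLE = (
    Step("denaturation", 94, 45),
    Step("hybridization", 58, 30),
    Step("elongation", 72, 45),
)
FINAL = Step("final elongation", 72, 4 * 60 + 30)
N_CYCLES = 40


def total_hold_seconds() -> int:
    """Sum of all programmed holds, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


total = total_hold_seconds()
print(f"{total} s = {total / 60:.1f} min")
```

Under these assumptions the programmed holds add up to 5610 seconds, i.e. 93.5 minutes, before ramping is taken into account.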

TABLE III Sequences of the primers used for the PCR reactions and their position on the HDV genome

Primers   5′ → 3′ sequence             Identification number   Positions *
6S        gaggaaagaaggacgcgagacgcaa    SEQ ID NO:31            904-929
6AS       accccctcgaaggtggatcga        SEQ ID NO:32            1141-1121
900S      catgccgacccgaagaggaaag       SEQ ID NO:33            889-911
1280AS    gaaggaaggccctcgagaacaaga     SEQ ID NO:34            1289-1265
318S      ctccagaggaccccttcagcgaac     SEQ ID NO:35            305-328
1150AS    cccgcgggttggggatgtgaaccc     SEQ ID NO:36            1161-1138
960S      gtacactcgaggagtggaaggcg      SEQ ID NO:37            962-984
345AS     tctgttcgctgaaggggtcct        SEQ ID NO:38            331-311
320S      ccagaggaccccttcagcgaac       SEQ ID NO:39            307-328
420AS     aacaccctcctgctagcccc         SEQ ID NO:40            446-427
R910S     ccggagttcctcttcctcctcc       SEQ ID NO:41            1206-1227
R910AS    gttcgcgtcegagtccttettte      SEQ ID NO:42            929-907
S1910R    gagctttcttcgattcggac         SEQ ID NO:43            1531-1550
AS1910R   gactggtcccctcatgttcc         SEQ ID NO:44            572-553

* According to the numbering of Wang et al. (Nature, 1986, 323, 508-514; Nature, 1987, 328, 456)

4.1—Strategy for Amplifying the HDV Viral Genome

The pair of primers 6S and 6AS makes it possible to amplify a DNA fragment corresponding to the carboxy-terminal end of the gene encoding the delta antigen.

The R0 region comprising the carboxy-terminal end of the gene encoding the HD-Ag and a portion of the noncoding region was amplified for all the samples using the primers 900S (SEQ ID NO:33) and 1280AS (SEQ ID NO:34). The primer 900S used had 7 nucleotides deleted at the 5′ end, compared to that used by Casey et al., 1993a, mentioned above for the classification of the HDV genotypes.

Surprisingly, the selection of these primers makes it possible to amplify a fragment which distinguishes the known genotypes (I, II and III) from new genotypes.

The complete sequences of the HDV viral genome of four samples (dFr45, dFr47, dFr48 and dFr73) were obtained by amplification of two overlapping regions R′1 (850 bases) and R′2 (1050 bases), respectively, using the pairs of primers 318S/1150AS and 960S/345AS. For the dFr644 sample, the variability observed in the region corresponding to the primers described above led to the 644 region (R644) being amplified using a specific pair of primers: 900S and 480AS.
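
Because the HDV genome is circular, a primer pair whose antisense position is numerically lower than its sense position (such as 960S/345AS) amplifies a fragment that runs through the origin of the numbering. The sketch below estimates fragment sizes from the 5′ positions listed in Table III. The genome length of 1679 nucleotides is an assumption made here for the reference numbering and is not stated in the text; the function and dictionary names are hypothetical.

```python
GENOME_LENGTH = 1679  # assumed length of the reference genome used for numbering


def amplicon_length(sense_5prime: int, antisense_5prime: int,
                    genome_length: int = GENOME_LENGTH) -> int:
    """Length of the fragment bounded by the 5' ends of a primer pair on a circular genome."""
    if antisense_5prime >= sense_5prime:
        return antisense_5prime - sense_5prime + 1
    # The fragment runs past the last position and continues from position 1.
    return (genome_length - sense_5prime + 1) + antisense_5prime


# 5' positions of the sense and antisense primers, from Table III.
PRIMER_PAIRS = {
    "R0 (900S/1280AS)": (889, 1289),
    "R'1 (318S/1150AS)": (305, 1161),
    "R'2 (960S/345AS)": (962, 331),
}

sizes = {name: amplicon_length(*pair) for name, pair in PRIMER_PAIRS.items()}
for name, size in sizes.items():
    print(f"{name}: {size} nt")
```

With this assumed length, the calculated sizes are close to the rounded values given in the text for R′1 (850 bases) and R′2 (1050 bases); the exact figure for a given isolate depends on its own genome length.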

For the dFr910 sample, the R0 nucleotide sequence made it possible to define new primers specific for the sample in order to amplify the complete genome. Two pairs of primers were chosen: the primers R910S and R910AS, which amplify a 1400 base fragment corresponding to the G910 region, and the primers S1910R and AS1910R, which amplify a 650 base fragment (p910 region) and were essential for covering the entire genome.

The amplification of the various regions (R1, R2, R3, R644, R′1, R′2, G910 and p910) was carried out as described above. The hybridization and elongation temperatures and also the elongation time used for each of the PCRs are indicated in Table IV.

TABLE IV Amplification of the various fragments of the genome

Amplified   Fragment size   Hybridization          Elongation             Elongation
regions     (bases)         temperature (° C.)     temperature (° C.)     time
R1          960             62                     72                     1 min 15 s
R2          1100            56                     72                     1 min 30 s
R3          650             50                     72                     1 min
R644        1250            58                     72                     40 s
G910        1400            58                     72                     1 min 40 s
p910        650             58                     72                     1 min
R′1         850             63                     72                     1 min
R′2         1050            60                     72                     1 min 20 s

5—Analysis of the Amplification Products

An 8 μl volume of the PCR product was loaded, in the presence of 2 μl of loading solution, onto a 1.3% agarose gel prepared in 0.5× Tris-borate/EDTA buffer containing 0.5 μg/ml of ethidium bromide (ETB). Electrophoresis was carried out in 0.5× TBE buffer. The migration was carried out in the presence of a size marker (Raoul™, Appligene). The amplified fragment was visualized under ultraviolet rays at 312 nm and photographed.

6—Cloning and Sequencing of the HDV Genomes

Before the cloning and sequencing step, the amplification products are purified in order to remove all traces of salts and enzymes.

6.1—Elongation with Standard Taq Polymerase

This step is performed when the amplification of the product has been carried out with Pwo polymerase. It makes it possible to add deoxyadenosine (A) residues to the 3′ ends of the PCR products, due to the fact that Pwo polymerase, which has 3′→5′ exonuclease (proofreading) activity, decreases the incorporation of the deoxyadenosines.

A 10 μl volume of purified DNA was added to a 70 μl reaction mixture containing: 0.2 mM of dNTP, 1.5 mM of MgCl₂, 1× buffer and 2.5 U of Taq polymerase (Perkin Elmer). The elongation was carried out at 72° C. for 30 minutes. The PCR products then underwent further purification with phenol-chloroform and precipitation with ethanol, and were then taken up in 10 μl of sterile water.

6.2—Cloning in the pCRII-TA-Cloning Vector (Invitrogen)

Cloning is used to confirm the nucleotide sequence of the amplified DNA. It is carried out using the pCRII vector (Invitrogen).

The pCRII vector is in linear form. It has deoxythymidine (T) residues which allow the amplified product to be cloned by virtue of the complementary deoxyadenosine (A) residues added by the Taq polymerase. It also has the Sp6 and T7 promoter sequences, two EcoRI restriction sites which border the site for insertion of the PCR product, and the ampicillin resistance and kanamycin resistance genes. A fraction of the lacZα gene, encoding β-galactosidase, facilitates the selection of the recombinants by virtue of the color of the colonies. Specifically, the plasmids which have integrated the insert do not express the lacZα gene. The bacterial colonies are then white in the presence of β-galactosidase substrate (X-Gal or 5-bromo-4-chloro-3-indolyl-β-galactoside, Roche) and of an inducer of the gene (IPTG or isopropyl-thio-β-D-galactoside, Roche). Thus, the recombinant bacteria are selected by virtue of their ampicillin resistance and of a blue-white screen.

The chosen insert/vector molar ratio is 3/1 and the volume of PCR product used is variable, depending on the amount of DNA estimated by agarose gel electrophoresis as described above. The 10 μl reaction mixture contains 50 ng of pCRII vector, the corresponding amount of insert, 4 U of T4 DNA ligase, and the 1× ligase buffer. The ligation reaction is carried out for 18 hours at 14° C. The tubes are then stored at +4° C.
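
The amount of insert corresponding to a 3/1 ratio can be estimated with the usual proportionality between mass and fragment length. The sketch below is an illustration only: the vector size of 3900 bp is an approximate, assumed value that is not given in the text, and the function name is hypothetical.

```python
ASSUMED_VECTOR_BP = 3900  # approximate vector size; an assumption, not from the text


def insert_mass_ng(vector_ng: float, insert_bp: int,
                   vector_bp: int = ASSUMED_VECTOR_BP,
                   molar_ratio: float = 3.0) -> float:
    """Mass of insert giving the requested insert/vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# 50 ng of vector, as in the reaction mixture described in the text.
for region, size_bp in {"R'1": 850, "R'2": 1050}.items():
    print(f"{region}: {insert_mass_ng(50, size_bp):.1f} ng of insert")
```

The volume of PCR product to pipette then follows from the DNA concentration estimated on the agarose gel.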

Escherichia coli TOP10F′ bacteria (Invitrogen), made competent by treatment with calcium chloride, are stored at −80° C., ready for use. A 50 μl volume of competent bacteria is brought into contact with 3 μl of the ligation solution for 30 minutes, in ice. A heat shock (30 sec at 42° C.) causes the plasmid DNA to penetrate into the bacteria, which are immediately placed on ice again for a few minutes, before being incubated for 1 hour at 37° C. in 250 μl of SOC medium (2% tryptone; 10 mM NaCl; 2.5 mM KCl; 10 mM MgCl₂; 20 mM glucose, 5 g/l yeast extract). The colonies are then isolated on Petri dishes containing LB agar (Luria-Bertani medium) supplemented with ampicillin (50 μg/ml), onto which 40 μl of X-Gal (40 mg/ml) and 40 μl of IPTG (100 mM) have been spread.

6.3—Plasmid Extraction and Insert Analysis

The white colonies are seeded in LB broth-ampicillin (50 μg/ml) and incubated for 18 hours at 37° C., with shaking. A blue colony, i.e. a colony which has not inserted a fragment, is selected as a negative control for ligation.

The plasmid extraction is carried out using a commercial QIAprep® Spin Miniprep kit (Qiagen). Briefly, after centrifugation (3000 rpm at +4° C.) and removal of the supernatant, the bacterial pellet is suspended in 250 μl of buffer (50 mM Tris-HCl, pH 8, 10 mM EDTA, 100 μg/ml RNase A) and lysed by adding 250 μl of alkali buffer (200 mM NaOH, 1% SDS). After homogenization for 5 min, 350 μl of buffer (3M potassium acetate, pH 5.5) are added. The supernatant containing the plasmid DNA is then transferred into a QIAprep column. A centrifugation eliminates the eluate into the collecting tube.

The column is washed with an ethanol buffer and dried, and the DNA is then eluted in 50 μl of sterile water.

To verify the insertion of the fragment of interest, the plasmid is then digested with the EcoRI restriction enzyme. The digestion is carried out in a 30 μl reaction mixture containing: 2 μl of the plasmid solution, 10 U of EcoRI enzyme (Appligene) and 1× reaction buffer. The digestion lasts 2 hours at 37° C. and the result is visualized by agarose gel electrophoresis.

6.4—Sequencing by the BigDye Terminator Method

The sequencing is carried out on the PCR products purified beforehand on Microcon 50 columns (Amicon) or on the plasmid DNA. The fragments are either sequenced directly with the PCR primers (R0 fragment sequenced with the primers 900S and 1280AS), or after cloning in the pCRII vector using universal primers (Sp6 and T7).

Two different clones were selected for each of the amplified regions, in order to remove any possible ambiguities during reading of the nucleotide sequences.

The sequencing was carried out using the BigDye Terminator reagent (PE, Applied Biosystems). The sequencing principle consists of vertical electrophoresis, in a polyacrylamide gel, of the DNA labeled with four different fluorochromes. The DNA matrices are loaded onto the gel and separated according to their size, before subjecting the gel to a laser beam continuously. The laser excites the fluorochromes, which each emit at a different wavelength, detected by a spectrograph. Computer software, coupled to the sequencer, enables automatic analysis and conversion of the data to nucleotide sequences.

The 10 μl reaction mixture comprises: 4 μl of the labeling solution (deoxynucleoside triphosphates (dATP, dCTP, dGTP, dUTP), AmpliTaq DNA polymerase, MgCl₂, Tris-HCl buffer pH 9), 20 pmol of primer (sense or antisense) and 500 ng of plasmid purified on Centricon columns. The (sense and antisense) sequence reactions are carried out in a Perkin 9600 thermocycler, with 25 cycles (96° C. for 10 sec; 50° C. for 5 sec; 60° C. for 4 min). The products are then precipitated in 40 μl of 70% ethanol, loaded onto gel and analyzed using an automatic sequencer of the ABI PRISM 377 type.

The crude sequences obtained are in the form of electropherograms. The sequences are validated and exploited using the Sequence Navigator program (PE, Applied Biosystems). They are the subject of at least one double reading, with two different sequencing primers (sense and antisense), in order to minimize errors.
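
The double reading described above amounts to comparing the sense read with the reverse complement of the antisense read. The following is a minimal sketch of that comparison, assuming that the two reads have already been trimmed to the same region; it does not reproduce the Sequence Navigator program, and the function names and toy reads are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def compare_reads(sense_read: str, antisense_read: str) -> tuple[str, list[int]]:
    """Return the agreed sequence (N where the reads disagree) and the 1-based mismatch positions."""
    flipped = reverse_complement(antisense_read)
    if len(flipped) != len(sense_read):
        raise ValueError("reads must cover the same region")
    consensus, mismatches = [], []
    for position, (a, b) in enumerate(zip(sense_read, flipped), start=1):
        if a == b:
            consensus.append(a)
        else:
            consensus.append("N")
            mismatches.append(position)
    return "".join(consensus), mismatches


# Toy reads: the antisense read carries one discordant base.
consensus, mismatches = compare_reads("GATTACAG", "CTGTTATC")
print(consensus, mismatches)
```

Positions reported as mismatches are those for which the electropherograms would need to be inspected again.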

These sequences are then directly captured on a computer using the DNA Strider 1.3 software for rapid sequence analysis.

7—Computer Analysis of the Nucleotide and Protein Sequences

The read and corrected sequences are compared and subjected to the various phylogenetic algorithms.

The sequences obtained (22 sequences) were compared to 21 complete genomic sequences of HDV available in GenBank (Table V).

TABLE V Accession numbers of the various isolates

No.   Accession number (GenBank)   Isolate name                  Geographical origin
1     X04451                       Italy 1 (A20)                 Italy
2     M84917                       Lebanon I                     Lebanon
3     X85253                       Patient A                     Cagliari (Italy)
4     X60193                       Jul. 18, 1983 (patient S)     Japan
5     M92448                       Taiwan                        Taiwan
6     L22061                       Columbia                      Columbia
7     X77627                       Chinese human serum           Central China
8     L22064                       Peru-2                        Peru
9     L22063                       Peru-1                        Peru
10    L22066                       US-2                          United States
11    M58629                       Nauru                         Island of Nauru
12    U81988                       Somalia                       Somalia
13    U81989                       Ethiopia 1                    Ethiopia
14    AF098261                     Canada                        Canada (Quebec)
15    U19598                       Taiwan 3                      Taiwan
16    AF018077                     TW2b                          Taiwan
17    L22062                       Japan 3                       Japan
18    AF309420                     Miyako                        Island of Miyako (Okinawa, Japan)
19    D01075                       US-1                          United States
20    M21012                       W15                           Experimental transmission (marmot)
21    AJ307077                     W5                            Experimental transmission (marmot)
22    AJ309868 to AJ309881         Yakutia isolates              Yakutia (Russia)

The first step consists overall in aligning the sequences of interest with the reference HDV sequences described and listed in the databank (Genbank), using the CLUSTAL W1.8 program (Thompson et al., N.A.R., 1994, 22, 4673-4680). Minor manual corrections were sometimes necessary using the SeqPup program in order to optimize the alignment.

Two approaches were followed: the use of protein alignment for the HD gene and the study of the stability of the aligned positions using an appropriate alignment program.

Based on this nucleotide sequence alignment, phylogenetic trees are constructed using various algorithms. The analyses are based on the distance matrices (phenetic approach), calculations of maximum parsimony (MP; cladistic approach) and calculations of maximum likelihood (ML; statistical approach).

Phenetic Approach (Genetic Distance)

The principle of this method is to find pairs of neighboring sequences, minimizing the total length of the branches of the tree. This approach makes it possible to reconstruct a phylogeny on the basis of calculating the overall similarity between the sequences compared two by two, which is expressed by virtue of a distance. It is a method which makes it possible to convert the sequence data into numerical values of distances, arranged in a matrix. The topology of the tree is constructed so as to group together the sequences which have most characters in common using one of the grouping methods such as the neighbor-joining method (Saitou et al., 1987).

Cladistic Approach (Maximum Parsimony)

The principle of this method consists in establishing whether sequences are related by searching for shared nucleotide bases, minimizing genetic events. The maximum parsimony algorithm constructs a phylogenetic tree in such a way that it involves a minimum of mutations. The tree selected is that which requires the least change. This method is sensitive to the differences in degree of mutation along the branches. The “clades” or “monophyletic groups” consist of the groups of sequences sharing a common ancestor, excluding any other sequence.
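
To make the idea of "a minimum of mutations" concrete, the number of changes required by a given tree can be counted site by site with Fitch's small-parsimony algorithm. This is a textbook sketch on a toy alignment, named here as an illustration; it is not presented as the search strategy of the programs used in this work, and the taxa, sequences and tree shapes are hypothetical.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site on a rooted binary tree.

    A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: one substitution is needed at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Total number of changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


toy_alignment = {"A": "AAG", "B": "AAC", "C": "GAC", "D": "GTC"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, toy_alignment), parsimony_score(tree_2, toy_alignment))
```

The tree with the lower score is the one retained under the parsimony criterion; a full analysis repeats this scoring over many candidate topologies.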

Statistical Approach

The maximum likelihood method is considered to be a statistical approach. The program calculates the probability that a sequence will evolve toward another over time. In other words, it consists in considering the changes at each site or character as independent probability events. This likelihood algorithm is cumulative over all the sites, and the sum is maximized in order to estimate the branch length of the tree. This method requires a long calculation period in order to search for the most likely phylogenetic tree corresponding to the sequences observed, due to the fact that it takes into account the probability of change of each character.
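
The summation over independent sites can be illustrated on the simplest possible case: two aligned sequences and a single branch length. The sketch below assumes the Jukes-Cantor substitution model, which is chosen here only for brevity and is not stated in the text; the function names and sequences are hypothetical. The log-likelihood is accumulated site by site, and the branch length maximizing it is found by a simple grid search.

```python
import math


def jc69_log_likelihood(seq_a: str, seq_b: str, t: float) -> float:
    """Log-likelihood of two aligned sequences separated by branch length t (Jukes-Cantor)."""
    decay = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a site is unchanged
    p_diff = 0.25 - 0.25 * decay   # probability of one specific different base
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        # Each site contributes independently; 0.25 is the base frequency.
        total += math.log(0.25 * (p_same if x == y else p_diff))
    return total


def ml_branch_length(seq_a: str, seq_b: str,
                     step: float = 0.001, t_max: float = 2.0) -> float:
    """Grid search for the branch length with the highest log-likelihood."""
    candidates = [i * step for i in range(1, int(t_max / step) + 1)]
    return max(candidates, key=lambda t: jc69_log_likelihood(seq_a, seq_b, t))


seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACGTACCT"   # two differences out of twenty sites
estimate = ml_branch_length(seq_a, seq_b)
closed_form = -0.75 * math.log(1 - 4 * (2 / 20) / 3)
print(round(estimate, 3), round(closed_form, 4))
```

For two sequences the maximum can also be written in closed form, which is used above only to check the grid search. For a full tree, no such shortcut exists, which is why the method is computationally demanding.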

All the phylogenetic analyses were carried out using the Phylip 3.75 (PHYLogeny Inference Package) (Felsenstein et al., 1989) and PAUP* version 4.0beta6 (Phylogenetic Analysis Using Parsimony) (Swofford et al., 1998) computer programs.

The distances were calculated by the Kimura two-parameter method, which treats the transition rate (mutations T<->C and G<->A) at each site and the transversion rate (mutations “A or G”<-->“T or C”) at each site as different.
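
For two aligned sequences, the Kimura two-parameter distance is obtained from the proportion P of sites showing a transition and the proportion Q of sites showing a transversion, as d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)]. The following sketch applies this standard formula to a toy pair of sequences; the function name and sequences are hypothetical, and the treatment of gaps and ambiguous bases (skipped here) is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    pairs = zip(seq_a.upper().replace("U", "T"), seq_b.upper().replace("U", "T"))
    for x, y in pairs:
        if x not in "ACGT" or y not in "ACGT":
            continue  # gaps and ambiguous bases are skipped (assumption)
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GAGTACGTACGTACGTACGT"   # one transition (A/G) and one transversion (C/A)
distance = k2p_distance(seq_a, seq_b)
print(round(distance, 4))
```

The corrected distance is slightly larger than the raw proportion of differing sites (0.10 in this toy case), since it allows for multiple substitutions at the same site.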

The reliability and the robustness of the sequence groups (or of the topologies) are evaluated statistically by the resampling (or bootstrap) approach on 10³ and 10⁴ resamplings.
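
The resampling principle can be sketched as follows: the columns of the alignment are drawn at random with replacement to build a pseudo-alignment of the same length, the grouping is recomputed, and the bootstrap value of a group is the percentage of replicates in which it is recovered. In this sketch the tree-building step is replaced by a deliberately trivial stand-in (the pair of sequences with the fewest differences), which is a hypothetical simplification and not the neighbor-joining or parsimony reconstruction used in this work.

```python
import random
from itertools import combinations


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def closest_pair(alignment: dict[str, str]) -> frozenset:
    """Stand-in for tree building: the two sequences with the fewest differences."""
    def differences(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return frozenset(min(combinations(sorted(alignment), 2), key=differences))


def bootstrap_value(alignment, group, n_replicates=1000, seed=1) -> float:
    """Percentage of replicates in which the group is recovered."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(bootstrap_alignment(alignment, rng)) == group
        for _ in range(n_replicates)
    )
    return 100.0 * hits / n_replicates


toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAC", "C": "TGCATGCATG"}
support = bootstrap_value(toy, frozenset({"A", "B"}))
print(support)
```

In this toy case A and B are identical and C differs at every site, so the pair (A, B) is recovered in every replicate.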

The results obtained are in the form of a phylogenetic tree visualized using the Treeview program (version 1.6.5), proposing various presentations of the tree (cladogram, radial and phylogram). It also makes it possible to visualize the bootstrap values at each node and to determine a taxon as an outgroup (sequences of genotype III).

Translation of the delta gene to amino acids is carried out using the DNAStrider version 1.3 program. The protein sequence alignment is carried out as described above.

8—Genotypic Analysis of HDV by Restriction Polymorphism (RFLP)

The HDV is genotyped by PCR-RFLP of the R0 region, according to the following steps:

-   -   Step 1: The PCR products are digested with the two restriction enzymes SmaI and XhoI (New England Biolabs): 10 μl of amplified product are digested separately in two tubes with 5 U of SmaI or XhoI enzyme, respectively at 30° C. and at 37° C., for 3 hours in a final volume of 50 μl in the presence of the appropriate buffer and of sterile water. The digestion products are visualized under ultraviolet rays as described above and the fragment sizes are determined by comparison with a size marker (50 bp DNA ladder, or the V and VI markers, Life Technologies GibcoBRL).
    -   Step 2: The samples exhibiting a profile other than the genotype I profile are digested with another enzyme, SacII (New England Biolabs), for 3 hours at 37° C. and the digestion products are visualized as in step 1.
    -   Step 3: The genotype of the virus is determined based on analysis of the combination of the SmaI, XhoI and SacII restriction profiles.

9—Algorithm for Genotyping HDV by PCR-RFLP

The algorithm for genotyping HDV by PCR-RFLP comprises at least two steps:

-   -   the first consists of cleavage, with two restriction enzymes, SmaI and XhoI, of the R0 fragment amplified by RT-PCR from the RNAs extracted from the sera of the patients;
    -   the second for the patients of “non-I profile”, consists of cleavage of R0 with the SacII enzyme;
    -   sequencing of the R0 region or of the region encoding p24 (or, if necessary, of the entire genome), followed by phylogenetic analyses will only be carried out as a backup if unusual restriction profiles are obtained.
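
The first two steps of the algorithm above can be simulated on a known sequence. The sketch below cuts a linear amplicon at the recognition sites of the three enzymes (SmaI: CCC^GGG, XhoI: C^TCGAG, SacII: CCGC^GG) and applies the SacII digestion only when the SmaI/XhoI profile differs from a genotype I profile supplied by the caller. The toy amplicon and the genotype I profile used here are hypothetical placeholders, not actual R0 data, and exact base counts stand in for the approximate sizes read on a gel.

```python
# Recognition site and cut offset within the site, for each enzyme.
ENZYMES = {
    "SmaI": ("CCCGGG", 3),
    "XhoI": ("CTCGAG", 1),
    "SacII": ("CCGCGG", 4),
}


def digest(amplicon: str, enzyme: str) -> list[int]:
    """Fragment sizes, in 5' to 3' order, after complete digestion of a linear amplicon."""
    site, offset = ENZYMES[enzyme]
    seq = amplicon.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def rflp_profiles(amplicon: str, genotype_i_profile: dict) -> dict:
    """Step 1 (SmaI, XhoI), then step 2 (SacII) only for a non-I profile."""
    profiles = {
        name: sorted(digest(amplicon, name), reverse=True)
        for name in ("SmaI", "XhoI")
    }
    if profiles != genotype_i_profile:
        profiles["SacII"] = sorted(digest(amplicon, "SacII"), reverse=True)
    return profiles


TOY_AMPLICON = "AAAA" + "CCCGGG" + "TTTTT" + "CTCGAG" + "AAA" + "CCGCGG" + "TT"
HYPOTHETICAL_GENOTYPE_I = {"SmaI": [32], "XhoI": [32]}  # placeholder profile
profiles = rflp_profiles(TOY_AMPLICON, HYPOTHETICAL_GENOTYPE_I)
print(profiles)
```

Since the three recognition sites are palindromic, searching a single strand is sufficient. A combined profile that matches no known genotype would then be referred to sequencing, as indicated in the last item above.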

EXAMPLE 2 Demonstration of New HDV Genotypes

1—Phylogenetic Analysis of the R0 Region

22 samples from patients infected with HBV and HDV were analyzed. The R0 region was amplified by PCR and the fragment obtained was then sequenced using the primers 900S and 1280AS.

The phylogenetic study was carried out using alignment of 336-base sequences of R0 (the ambiguous regions are eliminated), including therein, in addition to the 22 sequences studied, 15 reference sequences and 6 R0 sequences from Yakutia HDV (Pt13, 26 (SEQ ID NO:66), 29, 62 (SEQ ID NO:67), 63 and 704). The name given to the sequences corresponds to dFr (for “delta France”) followed by the patient serum number.

a) Genetic Distance Analysis

The phylogenetic tree obtained by reconstruction using genetic distances of the R0 region is given in FIG. 1.

The topology of the tree individualizes genotypes I and III, represented respectively by seven and three reference nucleotide sequences. The other reference sequences are represented by the type II sequences (Japan, Taiwan-3 and Yakutia sequences), and a group of two sequences (TW2b, Miyako) each described respectively as prototype of “subtypes IIB and IIC”.

This tree shows that the viral sequences originating from the 22 samples analyzed correspond to two situations:

-   -   11 sequences are affiliated with the genotype I sequences, with the exception of the sequence dFr46, which appears to be related to the sequence US-1 described by Makino (Makino et al., 1987); all these sequences are distributed heterogeneously within genotype I;
    -   the remaining 11 sequences are very far removed from genotype I and from genotype III. In addition, none of these sequences is directly grouped together with the type II sequences (Japan, Taiwan-3, Yakutia 13, 26, 29, 62, 63, 704) or with the (TW2b, Miyako) sequence group; these reference sequences form on their own two distinct groups.

The topology of the tree obtained by reconstruction using the genetic distances of the R0 region shows that the nucleic acid molecules isolated from the various variant HDVs are distributed within four subgroups (FIG. 1):

-   -   the dFr644 molecule, which appears to be isolated; it possesses, however, with a group of three molecules (dFr45, dFr2066 and dFr1843), a node which is supported by a bootstrap value of only 66.7%;
    -   on the other hand, the branch which unites the dFr45, dFr2066 and dFr1843 molecules is robust, since it is supported by a bootstrap value (BV) of 99.9%;
    -   a set of five molecules: dFr47, dFr910, dFr69, dFr73 and dFr1953 is supported by a BV of 100% and
    -   a pair of molecules dFr48 and dFr2020, which is also supported by a BV of 100%.

b) Maximum Parsimony Analysis

The phylogenetic tree obtained by reconstruction using the maximum parsimony of the R0 region is given in FIG. 2.

The maximum parsimony analysis supports the same topology as the genetic distance analysis. The reconstruction demonstrates the existence, within the 11 variant sequences, of the same three monophyletic groups; for example, with this approach, the group of five molecules dFr47, dFr910, dFr69, dFr73 and dFr1953 is also supported by a BV of 97% (FIG. 2).

The 11 variant molecules, the genotype II molecules and the [TW2b, Miyako] set appear to derive from a common branch which could, by comparison with the genotype I and genotype III sequences, individualize all the genotype II sequences. However, the bootstrap values supporting this branch are relatively moderate: 88.5% by NJ and 64.5% by MP (resampling carried out on 10⁴ samples) compared with those of genotype I (BV=99.8%) and genotype III (BV=100%). In addition, the average distance between the various subgroups defined within the 11 variant HDVs or between these variants and the genotype II sequences appears to be higher than between all the genotype I isolates or within the three molecules defining genotype III.

All these results emphasize the characterization of new HDV genotypes.

2—Phylogenetic Analysis of all the Genomes

a) Reconstruction of the Complete Genome from Amplified Fragments

In order to study the complete genome of these variants, and with the aim of specifying their affiliation, several regions of the HDV genome were amplified (Table II) from 6 samples, selected so as to include at least one member of each of the 4 subgroups and three representatives of the major group: dFr45, dFr47, dFr48, dFr73, dFr644 and dFr910.

More precisely, the following fragments were amplified by PCR (Table IV):

-   -   fragments of 850 bp (R′1) and 1050 bp (R′2) overlapping at their ends for dFr45, dFr47, dFr48 and dFr73,
    -   two overlapping fragments of 960 bp and of 1250 bp for dFr644, and
    -   two fragments of 1400 bp and 650 bp for dFr910.

All these amplified genomic regions were cloned into a pCRII™ vector (Table VI). Two clones corresponding to each of the amplified fragments were sequenced. Reconstruction of complete consensus HDV cDNA sequences was carried out after alignment of the overlapping regions and alignment with the reference sequences.

TABLE VI pCRII clones containing the various inserts

Isolate   R0              R′1                 R′2                 G910            R1
dFr45     —               dFr45R′1 clone 2    dFr45R2 clone 8     —               —
                          dFr45R′1 clone 4    dFr45R2 clone 10
dFr47     —               dFr47R′1 clone 13   dFr47R2 clone 19    —               —
                          dFr47R′1 clone 16   dFr47R2 clone 22
dFr48     —               dFr48R′1 clone 23   dFr48R2 clone 19    —               —
                          dFr48R′1 clone 28   dFr48R2 clone 22
dFr73     —               dFr73R′1 clone 36   dFr73R2 clone 29    —               —
                          dFr73R′1 clone 39   dFr48R2 clone 33
dFr644    —               —                   —                   —               644R1 clone 4
                                                                                  644R1 clone 8
dFr910    910R0 clone 4   —                   —                   R910 clone 29   910R1 clone 4
          910R0 clone 4                                           R910 clone 31   910R1 clone 5

b) Analysis of Six New Complete HDV Genomic Sequences of African Origin

-   -   b₁) Clinical Characterization of 6 Patients

Five patients originate from West Africa, and one patient has spent time in Cameroon. At the time samples were taken, these patients had been residing in the Parisian region for at least two years. All these patients were suffering from severe hepatitis and the clinical data are summarized in FIG. 3.

-   -   b₂) Genomic Organization of the New HDV Sequences

Comparative analysis of the R0 regions of 22 patients infected with HDV and HBV with those available in the databases demonstrated the great genetic diversity of the HDV viral genome.

The size of the complete genomes is different for the six sequences of the six HDV isolates of African origin, which confirms the variability of HDV:

-   the viral genome of the dFr910, dFr47 and dFr73 isolates, comprising 1697 nucleotides, is the longest ever described for HDV;
-   the genome of the dFr45 isolate appears to be among the smallest (1672 nt), and
-   the genomic sequences of the dFr644 and dFr48 viruses are, respectively, 1680 nt and 1687 nt.

The analysis after alignment of the various sequences studied reveals a high degree of conservation in the regions of the HDV genome corresponding to the ribozymes responsible for cleavage of the genomic and antigenomic RNAs. Similarly, the reading frame encoding the delta antigen is found on the antigenomic strand. A tryptophan codon (UGG) is the only one to be characterized for two sequences (dFr47, dFr910), and an ambiguity (G/A) found for the other four sequences indicates that the small delta protein and the large delta protein are very probably synthesized. The variable regions comprise the noncoding portion and also the 5′ and 3′ ends of the LHD gene. Notably, an insertion of 7 nucleotides exists in the dFr48 sequence. This insertion is present in a loop corresponding to one of the ends of the genome in its pseudo-double-stranded form (at position 797 of the Italy reference sequence (Wang et al., 1987)).

c) Comparison of the Six HDV Sequences of African Origin with the Sequences Representative of the Various Genotypes

Comparison of the six new molecules with the known molecules, representative of the three known genotypes, indicates a nucleotide similarity of between 71.7% (dFr45 versus Lebanon) and 80.0% (dFr73 versus Yakutia p26) with regard to the genotype I and II molecules and the TW2b and Miyako molecules. Specifically, for each of the six isolates, the mean nucleotide similarity is of the order of 73.3% to 74.6% with the genotype I molecules, 74.5% to 78.8% with those of genotype II and 74.6% to 77.8% with the TW2b/Miyako molecules. On the other hand, the nucleotide similarity with the Peru isolate (genotype III) is only 63.9% to 66.0%, confirming the particularly distant nature of this molecule (Table VII). In addition, when the six molecules corresponding to these complete genomes and defining the six variants dFr45, dFr47, dFr48, dFr73, dFr644 and dFr910 are compared with one another, only the group of molecules dFr73, dFr910 and dFr47 exhibits a sequence similarity of the order of 90%. The dFr45, dFr48 and dFr644 molecules are as distant from one another as they are from genotypes I and II, from the TW2b/Miyako sequences and from the group of molecules dFr73, dFr910 and dFr47 (of the order of 73.2% to 78%) (Table VIII).
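A pairwise similarity percentage of the kind reported here can be computed from two aligned sequences as sketched below. The exact formula and the treatment of gaps are not stated in this description, so the conventions used (columns where both sequences carry a gap are skipped; a gap opposite a base counts as a mismatch) are assumptions, as are the function name `percent_similarity` and the toy alignment.

```python
def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical columns between two aligned sequences.

    Assumptions (not stated in the source): columns where both sequences
    have a gap are ignored; a gap facing a base counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment (hypothetical, not HDV data): 8 identical columns out of 10
aln_a = "ACGTACGT-A"
aln_b = "ACGAACGTTA"
print(percent_similarity(aln_a, aln_b))
```

A mean similarity against a genotype would then be the average of this value over all reference sequences of that genotype.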

TABLE VII Percentage similarity of the complete African HDV sequences with the various known genotypes (calculation of the mean)

Mean values; ranges, where reported, are given in parentheses.

| HDV isolate* | Type I | Type II | TW2b/Miyako | Type III |
|---|---|---|---|---|
| dFr45 | 73.3 (71.7-74.6) | 74.5 (73.2-75.5) | 74.6 | 66 |
| dFr47 | 74.2 (73.0-75.0) | 78.6 (78.2-79.9) | 77.4 | 65.5 |
| dFr48 | 73.3 (72.0-74.0) | 77.1 (76.6-77.7) | 75.5 (74.4-76.6) | 65.4 |
| dFr73 | 74.1 (73.0-75.0) | 78.8 (77.7-80.0) | 77.8 (77.5-78.0) | 65.9 |
| dFr644 | 73.6 (72.2-74.6) | 76.8 (76.2-77.2) | 77.0 (76.9-77.2) | 63.9 |
| dFr910 | 74.6 (73.0-75.8) | 77.9 (77.0-78.6) | 77.2 (77.0-77.5) | 64.6 |

*The reference HDV isolates correspond to the complete genomes studied in Example 2.1.

TABLE VIII Percentage similarity of the new HDV molecules with one another

| | dFr47 | dFr48 | dFr73 | dFr644 | dFr910 |
|---|---|---|---|---|---|
| dFr45 | 74.8 | 73.2 | 75 | 78 | 74.7 |
| dFr47 | | 77.1 | 90 | 76.3 | 89 |
| dFr48 | | | 77.7 | 75.5 | 76.1 |
| dFr73 | | | | 76.3 | 89 |
| dFr644 | | | | | 76.1 |

d) Phylogenetic Analysis of the Six HDV Molecules of African Origin and of the Molecules Representative of the Various Genotypes

The phylogenetic analysis was carried out on the six complete sequences of African origin, sixteen reference sequences and two Yakutia sequences (Pt26 and Pt62). FIG. 4 illustrates the results obtained by distance analysis. The phylogenetic tree reconstructed by neighbor joining (NJ) shows that none of the six sequences studied (dFr45, dFr47, dFr48, dFr73, dFr644 and dFr910) is grouped together with the genotype I or genotype III reference sequences. The affiliation of these African sequences with the genotype II sequences (with the TW2b and Miyako sequences described, respectively, as subtypes IIB and IIC) is not supported by high bootstrap values (<70%) (Wu et al., 1998). In addition, the TW2b and Miyako sequences appear to form a distinct and monophyletic group with a BV of 100%. These two sequences appear to constitute on their own a “clade” representing a genotype different from type II.

In the distance analyses, the six African sequences are subdivided into 3 distinct subgroups (supported by BVs of greater than 90.3% for 10⁴ resamplings). The dFr47, dFr73 and dFr910 sequences constitute a group whose branch is based on a bootstrap value of 100%. To support these results, the maximum parsimony study was carried out on the same set of sequences (FIG. 5). By rooting the tree artificially using the “Peru-1” sequence, all the sequences of genotype I are individualized (BV=100%), as in all the analyses carried out above. The topology of the other sequences supports distribution of the African and Asian isolates in several groups; this shows the value of using the R0 region. Genotype II groups together the Yakutia, Taiwan-3 and Japan sequences with a BV of 99.9% on 10⁴ resamplings. Similarly, the individualization of TW2b and Miyako is confirmed (BV=100%). Finally, the African sequences indicate the existence of at least 3 subgroups. The monophyly of the dFr47, dFr73 and dFr910 sequences (BV=100%) supports the affiliation of these sequences in a subgroup. On the other hand, the dFr48 sequence, which possesses, with the isolates of the preceding group (dFr910, dFr47, dFr73), a respective nucleotide similarity of 76.1, 77.1 and 77.7%, is grouped together with these sequences in only 55.4% of resamplings, suggesting its possible individualization. Although they appear to be distant from one another, the dFr45 and dFr644 sequences are grouped together with a high BV (NJ=96.5/MP=88.6) in the context studied.

Consequently, the phylogenetic analyses of both the R0 regions and the complete sequences of the African sequences indicate that the groups differ from one another and could constitute three (or even four) distinct genotypes; these results thus demonstrate the existence of at least seven HDV genotypes.

3—Analysis of the Amino Acid (aa) Sequence of the Delta Antigen (HD-Ag)

The HD-Ag is represented by the two forms p24 (sHD) and p27 (LHD) of the delta protein. The protein sequence of 1 to 194-195 amino acids corresponds to the small delta protein (sHD) or p24 form. The large delta protein (LHD) or p27 form has the same amino-terminal end and an extension of 19 to 20 amino acids at its carboxy-terminal end.

The alignment of the sequence of the HD antigen of the six African sequences with the known HD protein sequences is given in FIG. 6.

Analysis of the sequences shows that the six isolates of African origin have an amino acid identity of the order of 69 to 77% with the genotype I sequences, of 71 to 79% with the genotype II isolates, of 72 to 78% with the TW2b/Miyako sequences, and of 63% with the Peru isolate (genotype III).

The size of the proteins corresponding to the new isolates ranges between 213 and 214 amino acids. All these proteins have the same hydrophobicity profile. The p24 form has two small hydrophobic regions, one located in the region of amino acids 50-60 (between the polymerization site and the NLS) and the other between positions 160 and 172 (opposite an extremely conserved unit). Two other domains are well conserved in the various genotypes: they are the RNA-binding domain and the nuclear localization domain. As described in the literature, the carboxy-terminal end of the delta protein (between amino acids 195 and 215) constitutes a hypervariable region. Only two amino acids out of the 19-20 are conserved. They are the cysteine (C) corresponding to the farnesylation site of the large form of the HD protein, and the carboxy-terminal glycine (G). In addition, the signature sequences specific for the isolates of the same genotype, for example the 19 amino acids specific for the large protein of genotype I or the 20 amino acids of genotype III, are found.

On the other hand, for the protein sequences of the isolates of African origin, and of the genotype II and TW2b/Miyako isolates, the carboxy-terminal end appears to be subdivided into two domains. The variable domain is represented by amino acids 197 to 205 and the conserved domain ranges from amino acids 206 to 215 (RLPLLECTPQ)(FIG. 5).

4—Definition of 7 HDV Clades

Analysis of the complete sequences of the six African isolates makes it possible to define seven HDV clades corresponding to the following genotypes (Table IX):

TABLE IX Clade/genotype correspondence

| Clade | Genotype | Isolate |
|---|---|---|
| 1 | I | Italy, W5, W15, US1, US2, Lebanon, Ethiopia, Somalia, Island of Nauru, China, Cagliari, Canada, etc. |
| 2 | IIA | Japan, Taiwan3, Yakutia26, Yakutia 62 |
| 3 | III | Peru 1 |
| 4 | IIB, IIC | TW2b, Miyako |
| 5 | ? | dFr910, dFr73, dFr47 |
| 6 | ? | dFr48 |
| 7 | ? | dFr45, dFr644 |

EXAMPLE 3 Method for Genotyping HDV-1 to HDV-7 by PCR-RFLP

The genotyping is carried out according to the protocol described in Example 1.8.

1—Lack of Sensitivity of the 6A/6S PCR

Initially, three HB-Ag-positive patients posed a delta infection diagnostic problem. In these patients, severe hepatitis associated with the presence of anti-HDV IgM was observed, but no HDV replication was detected by RT-PCR using the primers 6A-6S described in Deny et al. (1991, 1993, 1994, mentioned above) for the routine diagnosis of HDV infection. The 6A/6S PCR amplifies a 234 bp cDNA fragment corresponding to the carboxy-terminal end portion of the LHD gene (position 904 to position 1141 on the viral genome).

The RNAs extracted from the serum of these same patients were reamplified using the pair of primers 900S and 1280AS defining the R0 region.

The results obtained using the samples from these three patients demonstrated the reproducible presence of a 400 bp band (R0) with the primers 900S and 1280AS, whereas the 6A-6S PCR remained negative.

These results were confirmed on a series of serum samples from patients which were analyzed in parallel with the pairs of primers 6A-6S and 900S-1280AS. Out of 286 samples, 14 were positive only with the R0 PCR.
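As a worked check of the figures just given, the share of samples detected only by the R0 PCR can be computed directly. The variable names are illustrative. The number of samples positive with both primer pairs is not stated here, so this sketch deliberately does not attempt to derive a sensitivity value for either assay.

```python
# Figures taken from the paragraph above
total_samples = 286
r0_only_positive = 14  # positive with 900S-1280AS, negative with 6A-6S

# Share of all tested samples that only the R0 PCR detected
share_r0_only = 100.0 * r0_only_positive / total_samples
print(f"{share_r0_only:.1f}% of samples were positive only with the R0 PCR")
```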

These results demonstrate greater specificity and better sensitivity of the primers 900S and 1280AS, compared with the primers 6S and 6A, for detecting HDV RNA in the serum of infected patients.

2—Restriction Profiles Expected for HDV-1 to HDV-7

The PCR-RFLP methods conventionally used (Wu et al., 1995a; Wu et al., 1995b; Casey et al., 1996) make it possible to distinguish three different delta genotypes. Use of the SmaI restriction enzyme does not differentiate all the genotypes I, IIA and IIB recognised to date, and the XhoI enzyme was used to differentiate “subtype IIA” from “subtype IIB” (Wu et al., 1995b).
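The fragment sizes expected from such a digestion can be estimated in silico from an amplicon sequence. The sketch below is an illustration only: it uses the standard recognition sites of the three enzymes (SmaI CCC^GGG, XhoI C^TCGAG, SacII CCGC^GG), treats the amplicon as linear, and reports top-strand fragment lengths. The function name `digest` and the toy amplicon are hypothetical and do not reproduce any HDV sequence.

```python
import re

# Recognition site and cut offset (bases from the site start, top strand)
ENZYMES = {
    "SmaI": ("CCCGGG", 3),   # CCC^GGG, blunt ends
    "XhoI": ("CTCGAG", 1),   # C^TCGAG
    "SacII": ("CCGCGG", 4),  # CCGC^GG
}


def digest(sequence: str, enzyme: str) -> list:
    """Return fragment sizes (largest first) for a linear amplicon."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # Lookahead so that overlapping occurrences of the site are all found
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    sizes = [end - start for start, end in zip(bounds, bounds[1:])]
    return sorted(sizes, reverse=True)


# Toy 60-base amplicon (hypothetical): one SmaI site, one XhoI site, no SacII site
toy = "A" * 10 + "CCCGGG" + "T" * 20 + "CTCGAG" + "A" * 18
for name in ENZYMES:
    print(name, digest(toy, name))
```

An uncut amplicon is returned as a single fragment of full length, which corresponds to the single-band profiles seen with some genotypes.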

Combining the two enzymes SmaI and XhoI in a first step reveals seven distinct profiles (from P1 to P7) (Table X). These seven profiles do not superimpose exactly on the seven clades (HDV-1 to HDV-7). Consequently, the samples of “non-P1” profile are cleaved in a second step with the SacII enzyme, thus resulting in the obtaining, in a combined manner, of ten distinct delta profiles (from D1 to D10)(Table XI) which can be linked specifically to the various clades described, by virtue of the phylogenetic analyses.
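The two-step reading of the profiles can be sketched as a pair of lookup tables. This is a minimal sketch: the profile combinations are transcribed from Tables X and XI, while the function name `classify` and the choice to report D1 directly for a P1 sample (since only “non-P1” samples go on to SacII digestion) are assumptions made for the illustration.

```python
from typing import Optional, Tuple

# Step 1: (SmaI profile, XhoI profile) -> combined profile, from Table X
STEP1 = {
    ("S1", "X1"): "P1", ("S2", "X2"): "P2", ("S2", "X3"): "P3",
    ("S2", "X4"): "P4", ("S3", "X5"): "P5", ("S4", "X2"): "P6",
    ("S1", "X2"): "P7",
}

# Step 2: (SmaI, XhoI, SacII) -> delta profile, from Table XI
STEP2 = {
    ("S1", "X1", "Sc1"): "D1", ("S2", "X2", "Sc2"): "D2",
    ("S2", "X3", "Sc3"): "D3", ("S2", "X4", "Sc3"): "D4",
    ("S3", "X5", "Sc4"): "D5", ("S4", "X2", "Sc2"): "D6",
    ("S1", "X2", "Sc3"): "D7", ("S1", "X2", "Sc4"): "D8",
    ("S4", "X2", "Sc3"): "D9", ("S2", "X2", "Sc3"): "D10",
}


def classify(sma: str, xho: str, sac: Optional[str] = None
             ) -> Tuple[Optional[str], Optional[str]]:
    """Return (step-1 profile, step-2 profile); None marks an unlisted pattern."""
    step1 = STEP1.get((sma, xho))
    if step1 == "P1":
        # Assumption: P1 samples are not digested with SacII and map to D1
        return step1, "D1"
    if sac is None:
        return step1, None  # second digestion not yet performed
    return step1, STEP2.get((sma, xho, sac))


print(classify("S1", "X1"))          # genotype I pattern
print(classify("S2", "X2", "Sc3"))   # dFr48 pattern
print(classify("S1", "X6", "Sc4"))   # pattern absent from both tables
```

Returning None for an unlisted combination, rather than raising an error, makes it easy to flag samples whose pattern is not covered by the two tables.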

TABLE X Restriction profiles, cleavage of the R0 region with the SmaI and XhoI enzymes (STEP 1)

| Genotypes described | SmaI fragments, size (bp) | SmaI profile | XhoI fragments, size (bp) | XhoI profile | SmaI-XhoI combined profile |
|---|---|---|---|---|---|
| I | 220, 179 | S1 | 383, 16 | X1 | S1 X1 P1 |
| IIA | 397 | S2 | 303, 78, 16 | X2 | S2 X2 P2 |
| IIB | 397 | S2 | 319, 79 | X3 | S2 X3 P3 |
| IIC (Miyako) | 397 | S2 | 157, 162, 79 | X4 | S2 X4 P4 |
| III | 298, 107 | S3 | 405 | X5 | S3 X5 P5 |
| II Yakutia | 178, 117, 110 | S4 | 303, 78, 16 | X2 | S4 X2 P6 |
| dFr45 | 217, 179 | S1 | 303, 78, 16 | X2 | S1 X2 P7 |
| dFr644 | 217, 179 | S1 | 303, 78, 16 | X2 | S1 X2 P7 |
| dFr47, dFr73, dFr910 | 179, 111, 107 | S4 | 303, 78, 16 | X2 | S4 X2 P6 |
| dFr48 | 397 | S2 | 303, 78, 16 | X2 | S2 X2 P2 |

TABLE XI Restriction profiles expected after cleavage of the R0 region with the SmaI, XhoI and SacII enzymes (STEP 2)

| Genotypes described | SacII fragments, size (bp) | SacII profile | SmaI-XhoI/SacII combined profile |
|---|---|---|---|
| I | 362, 38 | Sc1 | S1 X1 Sc1 D1 |
| IIA | 266, 92, 38 | Sc2 | S2 X2 Sc2 D2 |
| IIB | 268, 130 | Sc3 | S2 X3 Sc3 D3 |
| IIC (Miyako) | 268, 130 | Sc3 | S2 X4 Sc3 D4 |
| III | 405 | Sc4 | S3 X5 Sc4 D5 |
| II Yakutia | 266, 92, 38 | Sc2 | S4 X2 Sc2 D6 |
| dFr45 | 268, 130 | Sc3 | S1 X2 Sc3 D7 |
| dFr644 | 397 | Sc4 | S1 X2 Sc4 D8 |
| dFr47, dFr73, dFr910 | 268, 130 | Sc3 | S4 X2 Sc3 D9 |
| dFr48 | 268, 130 | Sc3 | S2 X2 Sc3 D10 |

3—Genotyping of the Samples from Patients by PCR-RFLP

Based on the PCR-RFLP analysis of samples (more than 50):

-   no genotype II or III was found,
-   89.7% of the patients exhibited a D1 profile (genotype I) and 10.3% exhibited a “non-I” profile, and
-   two new XhoI profiles (X6 and X7) resulting in three new additional combinations (D11, D12 and D13) were detected (Tables XII and XIII).

TABLE XII New XhoI restriction profiles obtained from five patients originating from West Africa (STEP 1)

| Patients | SmaI fragments, size (bp) | SmaI profile | XhoI fragments, size (bp) | XhoI profile | SmaI-XhoI combined profile |
|---|---|---|---|---|---|
| dFr1843 | 218, 179 | S1 | 303, 78, 16 | X2 | S1 X2 P7 |
| dFr1953 | 218, 179 | S1 | 303, 78, 16 | X2 | S1 X2 P7 |
| dFr2020 | 392 | S1 | 303, 73, 16 | X2 | S2 X2 P2 |
| dFr2088 | 220, 179 | S1 | 242, 171, 16 | X6 | S1 X6 P8 |
| dFr2066 | 217, 179 | S1 | 237, 66, 16 | X7 | S1 X7 P9 |

TABLE XIII New XhoI, SmaI, SacII restriction profiles obtained in five patients originating from West Africa (STEP 2)

| Patients | SacII fragments, size (bp) | SacII profile | SmaI-XhoI/SacII combined profile |
|---|---|---|---|
| dFr1843 | 267, 130 | Sc3 | S1 X2 Sc3 D7 |
| dFr1953 | 267, 92, 38 | Sc2 | S1 X2 Sc2 D11 |
| dFr2020 | 262, 130 | Sc3 | S2 X2 Sc3 D10 |
| dFr2088 | 396 | Sc4 | S1 X6 Sc4 D12 |
| dFr2066 | 396 | Sc4 | S1 X7 Sc4 D13 |

The correspondence between the combined profiles and the genotypes identified by the phylogenetic analysis is given in Table XIV.

TABLE XIV Summary of the various results based on the phylogenetic analyses and the various corresponding profiles

| Clades | Genotypes | Isolates | Combined profiles (SmaI-XhoI/SacII) |
|---|---|---|---|
| HDV-1 | I | Italy | D1A |
| | | dFr2088 | D1B |
| HDV-2 | IIA | Japan | D2A |
| | | Yakutia isolates | D2B |
| HDV-3 | III | Peru 1 | D3 |
| HDV-4 | IIB | TW2b | D4A |
| | IIC | Miyako | D4B |
| HDV-5 | V | dFr47, dFr73 and dFr910 | D5A |
| | | dFr1953 | D5B |
| HDV-6 | VI | dFr48, dFr2020 | D6 |
| HDV-7 | VII | dFr45, dFr1843 | D7A |
| | | dFr2066 | D7B |
| | | dFr644 | D7C |

Bibliographical References

-   Casey J. L. et al., Proc. Natl. Acad. Sci. USA, 1993a, 90, 9016-20.
-   Casey J. L. et al., J. Infect. Dis., 1996b, 174, 920-6.
-   Chang F. L. et al., Proc. Natl. Acad. Sci. USA, 1991, 88, 8490-8494.
-   Chao Y. C. et al., Hepatology, 1991b, 13, 345-52.
-   Deny P. et al., Res. Virol., 1994, 145, 287-95.
-   Deny P. et al., J. Med. Virol., 1993, 39, 214-8.
-   Deny P. et al., J. Gen. Virol., 1991, 72, 735-9.
-   Felsenstein J. et al., Cladistics, 1989, 5, 164-166.
-   Gaeta G. B. et al., Hepatology, 2000, 32, 824-7.
-   Glenn J. S. et al., Science, 1992, 256, 1331-3.
-   Hwang S. et al., Virology, 1993a, 193, 924-931.
-   Imazeki F. et al., J. Virol., 1990, 64, 5594-5599.
-   Imazeki F. et al., Nucl. Acid. Res., 1991, 19, 5439-5440.
-   Lai M. M. C. et al., S. Hadziyannis, J. Taylor and F. Bonino (ed.), Hepatitis delta virus. Molecular biology, Pathogenesis, and Clinical aspects, 1993, 382, 21-27. Wiley-Liss, New York.
-   Lee C. M. et al., Virology, 1992, 188, 265-273.
-   Lee C. M. et al., J. Med. Virol., 1996b, 49, 145-54.
-   Makino S. et al., Nature, 1987, 329, 343-6.
-   Nakano T. et al., J. Gen. Virol., 2001, 82, 2183-2189.
-   Niro G. A. et al., J. Hepatol., 1999, 30, 564-9.
-   Niro G. A. et al., Hepatology, 1997, 25, 728-34.
-   Roingeard P. et al., Clin. Infect. Dis., 1992, 14, 510-14.
-   Saitou N. et al., Mol. Biol. Evol., 1987, 4, 406-25.
-   Sakugawa H. et al., J. Med. Virol., 1999, 58, 366-72.
-   Shakil A. O. et al., Virology, 1997, 234, 160-7.
-   Swofford D. et al., PAUP*: Phylogenetic Analysis Using Parsimony (and other methods), version 4.0d64.
-   Thompson J. D. et al., Nuc. Acid. Res., 1994, 22, 4673-80.
-   Wang K. S. et al., Nature, 1986, 323, 508-514.
-   Wang K. S. et al., Nature, 1987, 328, 456.
-   Wu J. C. et al., Hepatology, 1995a, 22, 1656-60.
-   Wu J. C. et al., J. Gen. Virol., 1998, 79, 1105-13.
-   Wu J. C. et al., Lancet, 1995b, 346, 939-41.

1. An isolated nucleic acid molecule comprising a complete genome of an hepatitis D virus (HDV) selected from the group consisting of SEQ ID NO:1, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:16, SEQ ID NO:21, and SEQ ID NO:26.
2. A recombinant vector comprising the isolated nucleic acid molecule of claim 1.
3. A cell transformed with the isolated nucleic acid molecule of claim 1.
4. A translation product encoded by at least one of the isolated nucleic acid molecules of claim 1.
5. A kit comprising at least one of the isolated nucleic acid molecules of claim 1 and one or more suitable reagents for detecting and genotyping HDV.